# Efficient quantization
Baidu ERNIE 4.5 0.3B PT GGUF
Apache-2.0
A quantized version of Baidu's ERNIE-4.5-0.3B-PT model, produced with llama.cpp to reduce model size and improve inference efficiency.
Large Language Model Supports Multiple Languages
bartowski
314
3
BAAI RoboBrain2.0 7B GGUF
Apache-2.0
A quantized version of BAAI's RoboBrain2.0-7B model, produced with llama.cpp and offered in several quantization types to suit different hardware.
Large Language Model
bartowski
448
3
Sophosympatheia StrawberryLemonade L3 70B V1.0 GGUF
StrawberryLemonade-L3-70B-v1.0 is a quantized large language model designed to run efficiently under different hardware conditions.
Large Language Model English
bartowski
1,406
1
Wan14bt2vfusionx Fp16 GGUF
Apache-2.0
Wan14BT2VFusionX is a text-to-video generation model that supports video generation through the ComfyUI-GGUF custom node.
Video Processing
lym00
133
0
Qwen3 0.6B GGUF
Apache-2.0
Qwen3 is the latest version of the Tongyi Qianwen series of large language models, offering a range of dense and Mixture of Experts (MoE) models. Based on large-scale training, Qwen3 has achieved breakthrough progress in reasoning capabilities, instruction following, agent functionalities, and multilingual support.
Large Language Model English
prithivMLmods
290
1
Gemma 3 27b It Qat Unsloth Bnb 4bit
Gemma 3 is a lightweight, state-of-the-art multimodal open-source model launched by Google, capable of processing text and image inputs and generating text outputs.
Image-to-Text
Transformers
unsloth
2,591
1
3b Ko Ft Research Release Q4 K M GGUF
Apache-2.0
This is a 3B-parameter language model optimized for Korean, converted to GGUF format for compatibility with llama.cpp.
Large Language Model Korean
freddyaboulton
165
0
Gemma 3 4b It Qat Q4 0 Gguf
Gemma 3 is a lightweight open-source multimodal model family launched by Google, built on the same technology as Gemini, supporting text and image inputs and generating text outputs.
Image-to-Text
vinimuchulski
197
0
Cohereforai.c4ai Command R 08 2024 GGUF
The quantized version of the Command R model released by CohereForAI, aiming to make knowledge accessible to the public.
Large Language Model
DevQuasar
152
1
Gemma 3 4b It GGUF
Gemma 3 4B IT is a lightweight open-source large language model released by Google. With 4B parameters, it is suited to dialogue and instruction-following tasks.
Large Language Model
Transformers
tensorblock
395
0
Granite Embedding 107m Multilingual GGUF
Apache-2.0
A quantized version of the multilingual embedding model developed by the IBM Granite team, supporting text embedding tasks in 17 languages, suitable for scenarios such as retrieval and information extraction.
Text Embedding Supports Multiple Languages
bartowski
15.19k
1
Granite 8b Code Instruct 128k GGUF
Apache-2.0
IBM Granite 8B code instruction model, supporting a context length of 128k, focusing on code generation and instruction understanding tasks.
Large Language Model
Transformers Other
tensorblock
186
1
Qwen2.5 Coder 3B Instruct GGUF
Other
A quantized version of the Qwen2.5-Coder-3B-Instruct model, providing an efficient and convenient option for code generation and dialogue.
Large Language Model
Transformers Supports Multiple Languages
gaianet
1,784
2
Nasiruddin15 Mistral Dolphin 2.8 Grok Instract 2 7B Slerp GGUF
This is a 7B parameter model based on the Mistral architecture, optimized through quantization, offering various GGUF quantization versions to meet different hardware requirements.
Large Language Model
featherless-ai-quants
127
2
Molmo 7B O Bnb 4bit
Apache-2.0
The 4-bit quantized version of Molmo-7B-O, significantly reducing the memory requirement and suitable for environments with limited resources.
Large Language Model
Transformers
cyan2k
2,467
11
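As a rough illustration of why a 4-bit quantized model, such as the Molmo entry above, needs far less memory, the sketch below estimates the memory taken by the weights alone at two precisions. The nominal parameter count of 7 billion and the helper `weight_memory_gib` are assumptions made for this example; real checkpoints also store quantization constants and need runtime memory for activations and caches, so actual usage is higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: a nominal 7-billion-parameter model, not an exact count
# for any model in this list.
NOMINAL_PARAMS = 7e9

fp16 = weight_memory_gib(NOMINAL_PARAMS, 16)  # 16-bit weights
int4 = weight_memory_gib(NOMINAL_PARAMS, 4)   # 4-bit weights

print(f"16-bit: {fp16:.1f} GiB, 4-bit: {int4:.1f} GiB, ratio: {fp16 / int4:.0f}x")
# prints "16-bit: 13.0 GiB, 4-bit: 3.3 GiB, ratio: 4x"
```

Under these assumptions the weights shrink by about a factor of four, which is what lets a nominal 7B model fit on hardware with only a few gigabytes of memory.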
Llama 3.2 1B Instruct GGUF
The GGUF format version of Llama-3.2-1B-Instruct, providing broader support and better performance.
Large Language Model
MaziyarPanahi
190.76k
12
Openchat 3.6 8b 20240522 IMat GGUF
A version of the openchat/openchat-3.6-8b-20240522 model quantized with llama.cpp imatrix quantization. Files are provided in several quantization types, so users can download the one that fits their needs.
Large Language Model
legraphista
4,416
1
Deepseek V2 Lite IMat GGUF
A GGUF version of DeepSeek-V2-Lite produced with llama.cpp imatrix quantization, reducing storage and compute requirements and simplifying deployment.
Large Language Model
legraphista
491
1
Deepseek V2 Chat GGUF
MIT
The GGUF quantized version of DeepSeek-V2-Chat, suitable for local deployment and operation.
Large Language Model Supports Multiple Languages
leafspark
1,388
27
Mixtral 8x7B V0.1 Turkish GGUF
Apache-2.0
A model fine-tuned on a specific Turkish dataset, able to answer questions accurately in Turkish and well suited to Turkish text generation tasks.
Large Language Model
Transformers Supports Multiple Languages
sayhan
180
3